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Abstract 

Background: Finding an efficient metliod to solve the parameter estimation problem (inverse problem) for 
nonlinear biochemical dynamical systems could help promote the functional understanding at the system level for 
signalling pathways. The problem is stated as a data-driven nonlinear regression problem, which is converted into 
a nonlinear programming problem with many nonlinear differential and algebraic constraints. Due to the typical ill 
conditioning and multimodality nature of the problem, it is in general difficult for gradient-based local optimization 
methods to obtain satisfactory solutions. To surmount this limitation, many stochastic optimization methods have 
been employed to find the global solution of the problem. 

Results: This paper presents an effective search strategy for a particle swarm optimization (PSO) algorithm that 
enhances the ability of the algorithm for estimating the parameters of complex dynamic biochemical pathways. 
The proposed algorithm is a new variant of random drift particle swarm optimization (RDPSO), which is used to 
solve the above mentioned inverse problem and compared with other well known stochastic optimization 
methods. Two case studies on estimating the parameters of two nonlinear biochemical dynamic models have 
been taken as benchmarks, under both the noise-free and noisy simulation data scenarios. 

Conclusions: The experimental results show that the novel variant of RDPSO algorithm is able to successfully solve 
the problem and obtain solutions of better quality than other global optimization methods used for finding the 
solution to the inverse problems in this study. 



Background 

Evolutionary algorithms (EAs) have been widely used for 
data mining tasks in Bioinformatics and Computational 
Biology [1,2]. They are random search methods inspired 
by natural mechanisms existing in the biological world 
[1,2]. EAs originally comprised four types of paradigms, 
namely, genetic algorithms (GAs), genetic programming 
(GP), evolution strategies (ES), and evolutionary program- 
ming (EP), with GAs being the most popular one. Data 
analysis tools traditionally used in Bioinformatics were 
mainly based on statistical techniques, such as regression 
and estimation, and EAs played significant roles in hand- 
ling large biological data sets in a robust and computation- 
ally efficient manner [2]. 
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Currently, evolutionary computing techniques mostly 
comprise conventional EAs (GAs, GP, ES and EP), swarm 
intelligence algorithms, artificial immune systems, differ- 
ential evolution, as the main representative classes of evo- 
lutionary computing approaches [3]. Swarm intelligence is 
a class of evolutionary computing techniques simulating 
natural systems composed of many individuals that coor- 
dinate one another using decentralized control and self- 
organization. Two most influential and classical examples 
of swarm intelligence approaches are particle swarm opti- 
mization (PSO) and ant colony optimization (ACO) algo- 
rithms, which have been widely used in many different 
fields [3-7]. Particularly, PSO algorithms have shown their 
effectiveness in data mining tasks in bioinformatics due 
to their performance in solving difficult optimisation 
tasks [8-10]. 

Biochemical modelling can be considered a generic 
data-driven regression problem on the given experimen- 
tal data. The goal of biochemical modeling is to build the 
mathematical formulations that quantitatively describe 
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the dynamical behaviour of biochemical processes. For 
example, metabolic reactions are formulated as rate laws 
and described as a system of differential equations, the 
kinetic parameters of which are identified from a set of 
experimental data. Finding the solution of the parameter 
estimation problem, thus, plays a key role in building a 
dynamic model for a biochemical process, which, in turn, 
can help understand the functionality of the signalling 
pathways at the system level [11,12]. 

Since solving the inverse problem in biochemical process 
modelling involves a task of nonlinear programming, many 
numerical optimization methods have been used to deter- 
mine the parameters of biochemical models. These meth- 
ods can be generally classified into two categories, namely, 
local optimization methods and global optimization meth- 
ods [13]. The widely used local optimization tools for 
inverse problems are those based on gradient descent 
methods, the most popular being the Newton method [14]. 
This type of approaches, however, cannot be applied to 
non-smooth problems, since the objective functions of the 
problems are discontinuous or have discontinuous deriva- 
tives. Direct search methods, such as the Hooke-Jeeves 
method, the Needier-Mead simplex algorithm and the 
Downhill simplex algorithm, are also a kind of local opti- 
mization techniques that could be used to find a local 
minimum without the information from derivatives [13]. 
Normally, most local optimization approaches are used as 
single shooting methods. For each of them, the path of its 
optimization process leading to a final solution is deter- 
mined by the initial conditions for the state variables. 
Therefore, the algorithm will lead to a wrong minimum, 
particularly if the initial conditions depend on model para- 
meters. To overcome this shortcoming, one can adopt 
multiple shooting methods in which the time interval is 
partitioned and new initial conditions are used at the start 
of each time interval part [15]. The methods can offer the 
possibility to circumvent local optima by enlarging the 
parameter space during the optimization process. 

The aforementioned local search methods are generally 
less efficient for the inverse problems of biochemical mod- 
els, which are multimodal and high-dimensional. In order 
to solve these hard inverse problems efficiently, one can 
turn to global optimization methods, most of which incor- 
porate stochastic search strategies to prevent the search 
process from being stuck into the local optimal or subopti- 
mal solutions. The branch-and-bound approach is a global 
optimization method that converts the inverse problem 
into a convex optimization problem so that a global opti- 
mal solution can be obtained [16]. This method requires a 
finite search space that can be divided into smaller sub- 
spaces. A remarkable disadvantage is that it is applicable 
only if the lower and upper bounds of the objective func- 
tion can be computed. Simulated annealing (SA) can be 
effectively used for parameter estimation from time-course 



biochemical data as shown in [17]. However, it has a slow 
convergence speed and high computational cost, and is 
not easy to be parallelized. Genetic algorithms (GAs) 
represent a widely used global search technique that could 
be employed to predict the parameters of dynamic models 
[18]. Nevertheless, GAs are always complained of 
slow convergence speed and high computation cost. The 
evolutionary strategy (ES) approach showed its ability to 
successfully solve inverse problems in a performance com- 
parison made by Moles et al. [19] among a number of glo- 
bal optimization techniques on biochemical system 
identification problems. In contrast to SA, evolutionary 
algorithms, including ES and GAs, can be implemented as 
self-tuning methods and can be parallelizable, with the 
stochastic ranking evolutionary strategy (SRES) method 
being a very successful example [19-21]. Scatter search 
(SS) is known as a population-based random search 
approach that was proposed to identify the appropriate 
parameters for nonlinear dynamic biological systems 
[22,23]. As an evolutionary algorithm method, the SS 
method, as well as its hybrid with a local search step after 
the recombination operation, showed to be efficient in sol- 
ving inverse problems. Particle swarm optimization (PSO), 
also a population-based optimization technique from 
swarm intelligence and evolutionary computation area, 
has demonstrated its better performance than GAs in sol- 
ving inverse problems [24,25]. Hybrids of PSO with other 
methods have also shown their effectiveness in modelling 
biochemical dynamic systems [26-28]. However, PSO 
shows to be sensitive to the neighbourhood topology of 
the swarm, as commented in [29]. 

Other methods for parameter estimation include the 
Newton-flow analysis [30], the alternating regression 
technique [31], decoupling approaches [32,33], the collo- 
cation method [20,34], the decomposing method [35,36]. 
These approximation techniques, when incorporated into 
an optimization algorithm, can help reduce the number 
of objective function evaluations, which are very compu- 
tationally expensive. Additionally, radial basis function 
neural networks [37] and a quantitative inference method 
[38] have also been employed to solve inverse problems 
in biochemical process modelling. 

In all of the above cases, the optimization approach is 
used to minimize to the residual error of an inferred 
model against experimental data. Smaller error means that 
the model describes the dynamic behaviour of the bio- 
chemical system better and has more justification to be 
accepted as a valid mathematical representation of the sys- 
tem. Theoretically, the prediction error diminishes with 
the accuracy of the model increasing. This study focuses 
on developing an efficient optimization method for para- 
meter estimation of a given dynamic biochemical model. 
However, since parameter estimation problems of com- 
plex dynamic systems (generally with many parameters 
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and many differential equations) are high-dimensional, 
multimodal and more challenging to solve, but which 
allow to depict more complex biochemical processes, our 
goal in this study is to develop an efficient global optimiza- 
tion method for solving such inverse problems of complex 
biochemical dynamic systems. 

After extensive and in-depth study, we selected the PSO 
algorithm as a candidate to be modified in order to 
achieve our goal of solving complex inverse problems. The 
reason why PSO attracted us is that PSO has many advan- 
tages, such as faster convergence speed, lower computa- 
tional need, as well as being easily parallelizable and 
having fewer parameters to adjust. However, PSO has the 
following shortcomings. First of all, it was theoretically 
proven that the PSO is not a global convergent algorithm, 
even not a local convergent one, against the convergence 
criteria given in [39]. Practically, the algorithm is more 
prone to be trapped into local optimal or suboptimal 
points for a high- dimensional problem, due to the weak- 
ening of its global search ability during the mid and later 
stages of the search process. Next, PSO is widely known to 
be sensitive to its search parameters including upper limits 
of the velocity, and even to the "swarm topology", so that 
users may feel awkward when selecting the parameters 
and the topologies when using the algorithm [40]. Finally, 
the performance of PSO appears to be very sensitive to 
the setting of upper and lower bounds of the search scope 
[40]. If the global optimal solution is located near the 
boundary of the search scope, the algorithm may have lit- 
tle chance to catch it. We have found that these shortcom- 
ings are mainly attributed to the velocity update equation, 
which is the essence of the PSO algorithm, and where it 
seems to be much room for improvement so as to boost 
the global search ability of the PSO. 

In this study, inspired by the free electron model in 
metal conductors placed in an external electric field [41], 
we propose to use a variant of the PSO algorithm, called 
the random drift particle swarm optimization (RDPSO), in 
order to achieve our goal of effectively estimating the para- 
meters of complex biochemical dynamical systems. The 
motivation of the RDPSO algorithm is to improve the 
search ability of the PSO algorithm by fundamentally 
modifying the update equation of the particle's velocity, 
instead of by revising the algorithm based on the original 
equation so as to probably increase the complexity of the 
algorithmic implementation as well as its computational 
cost. It is different from the drift particle swarm optimiza- 
tion (DPSO) proposed by us in [42,43] in that it can make 
a better balance between the global search and the local 
search of the particle swarm. 

The original and basic RDPSO version was recently 
introduced by us in [44], which was used for solving other 
problems in [45,46]. A novel variant of RDPSO algorithm 



is being proposed in this work to solve the parameter 
identification problem for two biochemical systems. The 
novel variant proposed here is different from the original 
one in that it employs an exponential distribution for sam- 
pling the velocity of the particles, whilst the original one 
used the Gaussian distribution. 

The novel RDPSO variant is used for estimating the 
parameters of two benchmark models, one of which 
describes the thermal isomerization of a-pinene with 5 
parameters [22,47], the other of which has a three-step 
pathway with 36 parameters [19]. The results of RDPSO 
and some other well-known global optimization algo- 
rithms are then compared and discussed. It should be 
noted that although this paper is focused on the parameter 
estimation for biochemical modelling, just as PSO and 
other EAs, the proposed RDPSO variant can be employed 
as a general-purpose tool for optimization problems in 
data miming tasks, such as clustering, classification, 
regression, and so forth, which widely exist in bioinfor- 
matics and computational biology [1,2,8,48,49]. 

Methods 

Problem statement 

The inverse problem of a nonlinear dynamic system 
involves finding proper parameters so as to minimize the 
cost function of the model with respect to an experimen- 
tal data set, with some given differential equality con- 
straints as well as other algebraic constraints. Such a 
data-driven regression problem can be approached with 
statistical techniques, using the given experimental data 
and the proposed models with unknown parameters. As 
stated by Moles et al. [19], the problem can be mathema- 
tically formulated as a nonlinear programming problem 
(NLP) with differential-algebraic constraints, whose goal 
is to find 0 so as to minimize 

tf 

} = f (ymsd(0 - Yio, £))^w(t)(ynisd(0 - y{o, t))dt (i) 

0 

subject to 

/(f,x,y,«,v,,)=0 ,2) 

x(to) = xo (3) 
h(x,y,6l,0 =0 (4) 
g(x,y,0,t)<O (5) 

e'- <e <9" (6) 
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where / is the cost function of the model, 0 is a vector 
of model parameters to be estimated, ymsd(C) is the 
experimental measure of a subset of the output state vari- 
ables, y{0, t) is the prediction of those outputs by the 
model, X is the differential state variables and v is a vector 
of other (usually time-invariant) parameters that are not to 
be estimated. In Equation (1), W{t) is the weighting (or 
scaling) matrix, and the equation can be discretized into a 
weighted least-square estimator. In Equation (2), /is the 
set of differential and algebraic equality constraints 
describing the system dynamics (i.e., the nonlinear process 
model). Equation (3) gives the initial value of x. In Equa- 
tions (4) and (5), h and g are equality and inequality path 
and point constraints on system performance. In addition, 
0 is subject to upper and lower bounds, which are 
described by inequality constraints (6). 

The above defined inverse problem is generally a multi- 
modal (non-convex) optimization problem with multiple 
local optima due to the nonlinearity and constraints of 
the system dynamics. Even though many local and global 
optimization methods have been proposed to solve the 
problem as mentioned in Introduction, it is still challen- 
ging and very necessary to develop efficient optimization 
algorithms to deal with the parameter estimation pro- 
blems, especially those for the dynamic systems with 
many parameters and many equations. Therefore this 
study focuses on the optimization approach for the 
inverse problem using the proposed variant of random 
drift particle swarm optimization (RDPSO) and other 
global optimization methods. 

Particle swarm optimization 

The original PSO algorithm was introduced by Kennedy 
and Eberhart in [50]. The algorithm was inspired by the 
observed social behavior of bird flocks or fish schooling, 
and it roots its methodology both in evolutionary com- 
puting and artificial life. It shares many similarities with 
EAs, in that both the PSO and the EAs are initialized 
randomly with a population of candidate solutions and 
then update the population iteratively, in order to 
approximate the global optimal solution to the given 
problem. However, unlike EAs, PSO has no evolution 
operators such as crossover and mutation, but perform 
optimization tasks by updating the particles' position 
(potential solutions) according to a set of discrete differ- 
ential equations. It was shown that the PSO algorithm 
has comparable and even better performance than 
GAs [51]. 

In the PSO with m particles, each particle i 
(l<i<m), representing a potential solution of 
the given problem in a D-dimensional space, has 
three vectors at the k^^ iteration, namely, the 
current position xf = (Xj^irXj^j- • • • '^Id)' the velocity 



^ = (^i'^i-2'-- - 'Pi.o) and its personal best (pbest) 
position Pf = (Pfi,f^2'--- 'Pi.o)' which is defined as 
the position with the best objective function value 
found by the particle since initialization. A vector 
G*" = (Gj, G2, • • • , Go), called the global best igbest) posi- 
tion, is used to record the position with the best objective 
function value found by the all the particles in the particle 
swarm since initialization. With the above specification, 
the update equations for each particle's velocity and 
current position are given by: 

x|-i=x|^ + v*;i (8) 

fori = 1, 2, • • • m;j = 1, 2 • • • ,D, where Ci and C2 are 
known as acceleration coefficients and w is called the 
inertia weight, which can be adjusted to balance the 
exploration and exploitation ability of each particle [52]. 
Without loss of generality, we assume that the PSO is 
used to solve the following minimization problem: 

Minimize f{X) , s. £. X e S c R^, (9) 

where /(X) is an objective function (or fitness func- 
tion) and S denotes the feasible space. Consequently, 
can be updated by: 

^ jxf if/(xf)</(pf-i) 

and (j' can be found by: 

= i^, where g = argmin|/(Pf)] 

l<i<m 

In Equation (7), rjj and R\j are the sequences of two 
different random numbers with uniform distribution on 

the interval (0, 1), namely, r^^, R-j~L7(0, 1). In order to 

prevent the particle from flying away out of the search 

scope, Vy is restricted on the interval [— Vmax/ V^max], 

where Vmax is also a user-specified algorithmic 
parameter. 

Many researchers have proposed different variants of 
PSO in order to improve the search performance of the 
algorithm and proved this through empirical simulation 
[3-7,52-56]. 

The proposed random drift particle swarm optimization 

In [57], it was demonstrated that if the acceleration coeffi- 
cients are properly valued, each particle converges to its 

local attractor, = (Pn, ^^2' ' ' ' P?d)' that the conver- 
gence of the whole particle swarm can be achieved. Each 
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coordinate of is given by: 



cir':,P'!, + C2R'',G'' 



hi t,i 



hi ] 



which can be restated as 



, 1 < j < D 



j < D 



where 



4 = 



cut; + ^^^li 



(12) 



(13) 



(14) 



In the PSO algorithm, Ci and C2 are set to be equal, 
is a random number with uniform distri- 



Equation (7). This velocity update equation appears to add 
some effectiveness to the search performance of the parti- 
cle swarm but has some shortcomings. Firstly, the 



Gaussian distribution has a thin tail so that it has less 
opportunity to generate outliers. As a result, the thermal 
motion of the particle has less randomness and cannot 
significantly improve the particle's global search ability. 

Secondly, although the update equation of Vl^*^ : 



(15) 



and 



and thus 

■ "'J 

bution on the interval (0, 1), i.e. (^|j~L7(0, 1). 

During the search process of the PSO, as particles' cur- 
rent position are converging to their own local attractor, 
their current positions, pbest positions, local attractors 
and the gbest positions are all converging to one single 
point. The directional movement of each particle 
i towards resembles the drift motion of an electron in 
metal conductors placed in an external electric field. 
According to the free electron model [37], the electron 
has not only drift motion caused by the external electric 
field, but also a thermal motion, which appears to be a 
random movement. The superposition of the drift 
thermal motions makes the electron careen towards the 
location of the minimum potential energy. Thus, if the 
position of an electron in the metal is regarded as a 
candidate solution and the potential energy function as 
the objective function to be minimized, the movement of 
the electron resembles the process finding the minimum 
solution of the minimization problem. 

The above facts can lead to a novel variant of PSO if the 
particle in PSO is assumed to behave like an electron mov- 
ing in a metal conductor in an external electric field. More 
specifically, it can be assumed that at the kth. iteration, 
each particle i has drift motion towards as well as a 
thermal motion, with their velocities in each dimension / 
denoted as and V2''|\ respectively. As a result, the 

velocity of the particle is given by V^*^ = Vl^|i + V2\-^ . 
In the drift particle swarm optimization proposed in 
[42,43], we assume that Vly ^ follows a Maxwell distribu- 
tion, say a Gaussian probability distribution, and the 

V2\*^ is given by the social part plus the cognitive part in where 



can guarantee the particle to converge towards its local 
attractor, the two random scaling coefficients add random- 
ness to its motion, which means that the particle's position 
is sampled at uniformly random positions within the 
hyper-rectangle around the gbest position and its personal 
best position. It is not able to enhance the particle's 
global search ability because of the finite scope of the 
hyper-rectangle, but it may weaken its local search ability, 
which is the responsibility of the directional motion. 
Therefore, the velocity of the particle, which is given by 

the sum of V^^j^ and V2^|\ may not be able to make a 

good balance between the global search and the local 
search of the particle. In the present study, we employ a 
new way of determining Vly^ and V2\*^ . 
Here, we assume that the velocity of the thermal 

motion Vl^|^ follows a double exponential distribution, 

whose probability density function and probability distri- 
bution function are 



-l\v\ 



(16) 



-2|v| 



Fyife+i {v) = \ —e 



(17) 



respectively, where v represents the value of the ran- 
dom variable VI and ct^ is the standard deviation of 
the distribution. By employing a stochastic simulation 
method, we can express Vly ^ as 



hi 9 



+ \n(l/ul^ ifs> 0.5 
-ln(l/u'lj^ ifs< 0.5 



(18) 



(19) 



where s and Uy are two different random numbers 
uniformly distributed on the interval (0,1), i.e. 
= (C?,C^,--- ,C^). As for the value of a^, an 
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adaptive strategy is adopted to determine (Xy by 

or^- = 2a|C* - Xlj\, where & = {c\, C\,-- - , C^) is called 
the mean best {mbest) position, defined as the mean 
of the pbest positions of all the particles, i.e. 
(^ = [Vrn)j:Z,I*j{l<j<D). 

The velocity of the drift motion V2^J^ may have many 
possible forms. However, the following simple linear 
expression is adopted in this study: 

V2^j'=P{plj-Xlj) (20) 

where is determined by 

Pti = 4Ai + (1 - 'f'i.i^' 'P'lr^iO, 1), 1 < 7 < D (21) 

It can be immediately proven that if v'-J^ = V2-|\ 
when fe ^ CO , X^j p^j . Therefore the expression of 
Vl'l^^ in Equation (20) can indeed guarantee that the 

particle move directionally to as an overall result. 
With the definitions of the thermal motion and the 

drift motion of the particle, we can obtain a novel set of 
update equations for the particle: 

Vfj' = a|Cf - Xlj\cj,l^ + fiif^^j - X^) (22) 

4;'=^ + ^' (23) 

where a is called the thermal coefficient and p is 
called the drift coefficient. The PSO with Equations (22) 
and (23) is a novel variant of RDPSO, which employs a 
double exponential distribution instead of a Gaussian 
one. The procedure of this RDPSO variant is outlined 
below. 

Step 0: Randomly initialize the current positions and 
the pbest position of all the particles; 
Step 1: Set A: = 0; 

Step 2: While the termination condition is not met, do 
the following steps; 

Step 3: Set k=k+l and compute the mbest position , 
which is the centroid of the pbest positions of all the 
particles at iteration k; 

Step 4: From j = 1 , carry out the following steps; 

Step 5: Evaluate the objective function value /(X[') , 
and update Pj' and according to Equation (10) and 

Equation (11), respectively; 

Step 6: Update the components of the velocity and 
current position of particle i in each dimension, respec- 
tively, according to Equations (21), (22) and (23); 

Step 7: Set i=i+l, and return to Step 5 until i = m; 

Step 8: Return to Step 2; 



In the RDPSO algorithm, in addition to the population 
size m, a and are two very important user-specified 
algorithmic parameters, which play the same roles as the 
inertia weight w and acceleration coefficients in the basic 
PSO algorithm. That is, the can be tuned to balance the 
exploration and exploitation ability of the particle. How to 
select the values of these parameters to prevent the parti- 
cles from explosion is an open problem. Here, we per- 
formed the stochastic simulations for the one dimensional 
case, in which the local attractor was fixed at the origin 
and the mbest position was at ;x: = 0.1. The results of two 
simulations are visualized in Figure 1 and Figure 2, in 
which the logarithmic value of the absolute of X^ was 
recorded as ordinate, and the iteration number was the 
abscissa. Figure 1 shows that the particle's position was 
bounded when a = I and = 1.5. However, when 
a = 1.8 and yS = 1.5, the particle diverged to infinity. To 
obtain the sufficient and necessary condition for the parti- 
cle to be bounded, we will focus our attention on a theore- 
tical analysis in terms of probability measure in future. 

Setting large values for a and /i implies better global 
search ability of the algorithm, while setting small 
values means better local search. When the RDPSO is 
used for solving a problem, a good balance between the 
global search and the local search of the algorithm is 
crucial for the algorithmic performance. However, in 
order to find out how to tune the parameters to gener- 
ate generally good algorithmic performance we need a 
large number of experiments on benchmark functions, 
which will be performed in our future tasks. Here, we 
recommend that when the RDPSO is used, a should be 
set to be no larger than 1.0 and j8 to be no larger than 
1.5. More specifically, when the problem at hand is 
complex, the values of the two parameters should be set 
to be relatively large in order to make the particles 
search more globally, and on the other hand, when the 
problem is simple, relatively smaller values should be 
selected for the parameters, for the purpose of faster 
convergence speed of the algorithm. In the present 
study, the value of a and fi were set to be 0.75 and 
1.0, respectively. 

In addition, the population size and the maximum 
number of iterations (Maxlter) can also affect the per- 
formance of a population-based technique. Just as for 
other PSO variants, it is suggested that the population 
size should be larger than 20 for the RDPSO as well. 
The value of the Maxlter depends on the complexity of 
the problem. Generally, a smaller Maxlter value is used 
for simple problems, while a larger one is used for com- 
plex problems. 

Moreover, is also restricted within the interval 

[— Vmax/ Vmax] during the search process of the RDPSO 
algorithm, just as in the original PSO algorithm. 
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Figure 1 The result of the stochastic simulation with a = 1 and y6 = 1.5. It is sliown tliat, with tills parameter setting, the particle's 
position is bounded. 



The optimization methods compared 

Besides the PSO and RDPSO algorithms, the Differential 
Evolution (DE), Scatter Search (SS) method and two ver- 
sions of Evolution Strategies were also used to solve the 
selected inverse problems, for performance comparison 



purposes. The DE method, as presented by Storn and 
Price [58], is an evolutionary computing method, which 
has a faster convergence speed than GAs and can find 
the global optimal solution of a multidimensional and 
multimodal function effectively [58]. 
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Figure 2 The result of the stochastic simulation with a = 1.8 

position diverges as the iteration number Increases. 
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and p = 1.5. It is shown that, with this parameter setting, the particle's 
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The SS method is also a population-based search techni- 
ques originally developed by Glover [59]. In [22], a novel 
meta-heuristic method, which is the combination of the 
original SS method with a local search technique, was pro- 
posed to solve inverse problems. It was shown that the 
local search operator can accelerate the convergence 
speed significantly. Thus, in our experiments, we used this 
novel SS method for performance comparison. 

Evolutionary strategy (ES) is an important paradigm of 
EAs, which imitates the effects that genetics produces on 
the phenotype, rather than the genotype as in GAs [60]. 
The two canonical versions of ES we used in this study 
are denoted by {fi, A)-ES and {j4 + A)-ES, where ^ denotes 
the number of parents and A the number of offspring. In 
the {/A, A)-ES, the parents are deterministically selected 
from offspring {fi <A must hold), while in {ft + A)-ES, the 
parents are selected from both the parents and offspring. 

In addition, the performances of the above mentioned 
algorithms, including the RDPSO, are also compared 
with those of the SRES method. The SRES is a version 
of (ft, A)-ES that uses stochastic ranking to handle the 
constraints, by adjusting the balance between the objec- 
tive function and the penalty function on the course of 
the search [19], [61]. 



where (pi , p2 • ■ ■ / ps ) is the vector of unknown coeffi- 
cients to be estimated, yi, Yi, Yi, Y4 and Ys denote the 
concentrations of the a-pinene, dipentene, alloocimen, 
(3-pyronene and a dimer, respectively. 
Case study 2 

This case study involves the inverse problem to identify 
36 kinetic parameters of a nonlinear biochemical 
dynamic model formed by the following 8 ordinary dif- 
ferential equations that describe the variation of the 
metabolite concentrates with time 1 [19]. 




Case studies 

Two case studies involving two benchmark systems were 
carried out. For each system, we performed two groups 
of numerical experiments, one with noise-free simula- 
tion data, and the other with noisy simulation data. 
Case study 1 

The goal of this case study is to estimate the five rate con- 
stants of the homogeneous biochemical reaction describ- 
ing the thermal isomerization of a-pinene, which is an 
organic compound of the terpene class, one of two iso- 
mers of pinene [22], [23]. The mathematical model of this 
process is formulated with the following linear equations: 



dYi 
dt 

dyi 
dt 



piyi 



(24) 



(25) 



dEi V4 • Gi 



dt 


K4 


+ Gi 


dE2 


V5 


•G2 


dt 


K5 


+ G2 


dEs 


Ve 


•G3 



dt Ks + G3 



— fe4 • £1 



h ■ Ej 



fee • £3 



dt 



kcati ■ El ■ 



Km, 



(32) 



(33) 



(34) 



(S-Mi) 



1 + 



kcat2 ■ £2 



Ml 



Kmi Kmi 



1 



(35) 



• (Ml -M2) 



1 + 



Ml 



M2 



dY3 
dt 



■■ PlYl - {P3 + P4)y3 + P5F5 



dyi 
-df = 



dys 

— = P4Y3 - P5Y5 



(26) 



(27) 



(28) 



dMi 
dt 



kcat2 ■ E2 



kcats ■ £3 



(Ml - M2) 



Ml M2 
1 + + 



Km 5 



Knis 



(M2 - P) 



(36) 



M2 
1 + — - + 



Kms Krus 
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W;/ = { 1/ max [ypied(01il^ 

Wij = {1/ max [ypredCOlj}^' which was used to normaUze 
the contributions of each term [19]. 

Obtaining simulation data 

In order to evaluate the performances of the global optimi- 
zation methods in finding the solution of the inverse pro- 
blems, we chose a set of parameters for each model, which 
are considered as the true or nominal values. For Case 
Study 1, the true values of the parameters are pi = 5.93e-5, 
P2 = 2.96e-5, p3 = 2.05e-5, p4 = 27.5e-5 and ps = 4.00e-5. 
For Case Study 2, the nominal values of the model 
parameters are shown in Table 1. 

The pseudo-experimental data (essentially the simula- 
tion data) in either case were generated by substituting 
the chosen parameters into the dynamic model and per- 
forming fourth order Runge-Kutta method on the corre- 
sponding differential equations. For Case Study 2, the 
pseudo-measurements of the concentrations of metabo- 
lites, proteins, and messenger RNA species were the 
results of 16 different pseudo-experiments, in which, 
with the given nominal values for the parameters, the 
initial concentrates of the pathway substrate S and pro- 
duct P were varied for each experiment (simulation) as 
shown in Table 1. These simulated data represent the 
exact experimental results devoid of measurement noise 



Table 1 Experiment generation (simulation) and the nominal value. 



P 0.05 




0.13572 




0.36840 


1.0 


S 0.1 




0.46416 




2.1544 


10 


Parameter Element of decision variables vector 


Nominal value 


Parameter 


Element of decision variables vector 


Nominal value 


Vi 


e, 


1 


1/4 


019 


0.1 


Kh 


02 


1 


/C4 


020 


1 


nil 


03 


2 


k. 


021 


0.1 


/Co, 


04 


1 


1/5 


022 


0.1 




05 


2 


Ks 


023 


1 




06 


1 


ks 


024 


0.1 




07 


1 


Ve 


025 


0.1 


Kh 


08 


1 


K6 


026 


1 


nil 


09 
010 


2 
1 


ke 
kcau 


027 
028 


0.1 

1 


na2 


011 


2 


Kmy 


029 


1 


k2 


012 


1 


Km2 


030 


1 


V3 


013 


1 


kcati 


031 


1 


Ki, 


014 


1 


Km^ 


032 


1 


nk 


015 


2 


Km4 


033 


1 


Ka^ 


016 


1 


Keats 


034 


1 


na^ 


017 


2 


Krvs 


035 


1 




018 


1 




036 


1 



(P and S are initial concentrations of the pathway substrate and product, and were varied and combined to generate a total of 16 sets of pseudo-experimental 
(simulation) measurements) 



where Mi, M2, £1, E2, £3, Gi, G2 and G3 are the 
state variables representing the concentrations of the 
species involved in different biochemical reactions, and 

5 and P are controlling parameters which are kept fixed 
at the initial values for each experiment. The inverse 
problem is then reduced to the optimization problem 
that fits the remaining 36 parameters represented by 

6 = {61, 62, ■ ■ ■ , O^^). 

Objective functions 

The objective function (or fitness function) for the 
inverse problem in either of the two case studies is the 
discretization of Equation (1), which is formulated as 
the weighted sum of squares of the differences between 
the experimental and the predicted values of the state 
variables: 

n I 

^ = I] I] WydypredCO - yexp(l)]/ (37) 
1=1 j=l 

where n is the number of data for each experiment, 

/ is the number of experiments, Xexp is the vector of 
experimental values of the state variables, and Fpred is 
the vector of the values of state variables predicted by 
the model with a given set of parameters. In Case Study 
1, each was set to be 1 [22], while in Case Study 2, 
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Table 2 Configuration of search parameters in different algorithms (F is an algorithmic control parameter used to 
control the size of the perturbation in the mutation operator for DE, CR is the crossover constant in DE, varphi is a 
parameter determining the standard deviation of the mutation in the evolution strategy). 



Algorithm 


RDPSO 


SS method 


PSO 


DE 


(M, A)-ES 


{fj + A)-ES 


Search 


Population Size = 100 


initial Population Size = 100, 


Population Size = 100 


Population Size = 


1 00 A = 1 00 


A = 100 


parameters 


a = 0.75 


10 selected individual after 


w = 0.729 


F = 0.5 


fj = 10 


fj = 10 




P = 1.0 


initialization 


Cl = 1.49;c2 = 1.49 


CR = 0.55 


varphi = 1 


varphi = 1 



and they were used as noise-free data for the first group 
of numerical experiments in each case study. In order to 
test the optimization methods for noisy data, we added a 
white noise to each of the original noise-free data: 

z' = z + as (38) 

where z and z' represents the original noise-free data 
and the resulting noisy data, respectively, s is an ran- 
dom number with standard normal distribution, namely, 
e~N(0, 1), and cr is the standard deviation of the 
white noise. In our case studies, cr was set to 0.01 for 
both systems. 

Initial problem solver used 

During the search of each global optimization algorithm, 
each potential solution (i.e., a set of estimation values 
for the parameters) was substituted into the dynamic 
model. Then, the fourth order Runge-Kutta method was 
performed on the corresponding system of differential 
equations to generate a set of predicted values of the 
output state variables, from which the objective function 
value (or fitness value) of the potential solution could be 
evaluated according to Equation (37) with the obtained 
pseudo-experimental (simulation) data. This process is 
known as the solution to the forward problem, which 
was embedded in the iterations of the search during the 
solving of the inverse problem with the algorithm. 



Experimental settings 

For the sake of performance comparison, all the tested 
global optimization methods except the SS method (i.e., 
PSO, RDPSO, DE, ifi, A)-ES, and (^ + A)-ES) were pro- 
grammed in C++ on a VC++6.0 platform in Windows 
XP environment, and implemented on Intel Pentium 
Dual-Core E5300 2.6GHz PCs, each with 2 MB cache 
and 4 GB main memory. The SS method was imple- 
mented in Matlab 7.0, on the same platform, for the 
purpose of calling the local solver SQP in Matlab during 
the search process. The software for SS for inverse pro- 
blems can be found on http://www.iim.csic.es/~ging- 
proc/ssmGO.html. 

The configuration of the algorithm parameters including 
the population sizes are listed in Table 2. In Case Study 1, 
each optimization algorithm ran 20 times with each run 
executed for 500 iterations; that is, the maximum number 
of iterations (Maxlter) is 500. In Case Study 2, each algo- 
rithm also ran 20 times with each run executed for 
2250000 function evaluations, which is the same as that for 
the DE in [19]. Since the population size of each algorithm 
was 100, the value of Maxlter was 22500 in Case Study 2. 
For the SS method, the initial population size was 100, 
and 10 individuals were selected to perform the iterative 
search after initialization. Other parameters were selected 
according to recommendations from the corresponding 
references and/or our preliminary runs. For all the algo- 
rithms tested on the inverse problems, the statistical values 



Table 3 J values and computational time for the global optimization methods in Case Study 1. 



Results for the Experiments (simulations) with Noise-free Data 


Algorithms 


RDPSO 


SS method 


{fi. A)-ES 


PSO 


DE 


{(» + A)-ES 


Best Value of J 


1.3740e-14 


0.3930 


301.9941 


3.4461 e-005 


258.892 


3394941 


Mean Value of J 


2.5845e-004 


0.5703 


2.8522e+06 


0.0225 


2.3197e+03 


1.9103e+06 


Standard Deviation of J 


3.59788-04 


0.1348 


2.2673e+06 


0.0183 


704.7776 


2.2118e+06 


CPU time(h) 


0.0347 




0.0419 


0.0330 


0.0353 


0.0419 


Results for the Experiments (simulations) with Noisy Data 


Algorithms 


RDPSO 


SS method 


iti. A)-ES 


PSO 


DE 


(f( + A)-ES 


Best Value of J 


0,2023 


0.7195 


388.9242 


0.2028 


1.0273e+003 


52.481 7 


Mean Value of J 


0.2026 


0.8361 


2.1952e+05 


0.2083 


2.5488e+003 


8.2498e+05 


Standard Deviation of J 


4.291 8e-04 


0.0743 


3.2308e+05 


0.0126 


5393.6832 


1 .6245e+06 


CPU time(h) 


0.0349 




0.0423 


0.0338 


0.0351 


0.0422 
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0 50 100 150 200 250 300 350 400 450 500 
Figure 3 The figure visualizes the convergence curves of objective function values of all the algorithms averaged over 20 runs in the 
numerical experiments with noisy data for Case Study 1. It is shown that the RDPSO, PSO and SS methods had better convergence 

properties than other methods. 
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Figure 5 The figure shows the noisy experimental data (marlcer) versus the predicted model (continuous line) for Case Study 1. It is 

also shown that the predicted model obtained by RDPSO fits the experimental data well. 



of / were figured out and the results with best values of 
/ were selected, processed and visualized with Matlab 7.0. 

Results 

For Case Study 1, the statistical values of /from 20 search 
runs with 500 iterations by each algorithm are listed in 
Table 3. The best value of / (/ = 1.3740e-014) for the 
numerical experiments with noise-free simulation data was 
obtained by using our proposed RDPSO algorithm after 
running for 0.0348h (about 2 minutes). For the experiment 
with noisy data, the RDPSO generated the best J value (J = 



0.2023) as well. The proposed algorithm also showed the 
best performance on average among all tested methods, as 
shown by the mean value of / over 20 runs. In this case 
study, the basic PSO algorithm showed good perfor- 
mance on low-dimensional inverse problems. 

The convergence process of each tested algorithm aver- 
aged over 20 runs in the numerical experiment with noisy 
data in Case Study 1 is shown by the convergence curve in 
Figure 3, which is plotted in the log-log scale with objec- 
tive function values versus the iteration number. Evidently, 
the SS method showed a better convergence property than 



Table 4 J values and computational time of the global optimization methods for Case Study 2, Including the results 



for the noise-free and noisy data. 



Results for the Experiments with Noise-free Data 


Algorithms 


RDPSO 


SS method 


(f/, X)-ES 


PSO 


DE 


[fj + A)-ES 


Best Value of J 


0.009124 


71358e-07 


0.022858 


7.140163 


10168989 


0.123209 


Mean Value of J 


0.178881 


3.4.274e-06 


0.736311 


10.3859 


17701876 


2.141820 


Standard Deviation of J 


0.252749 


1 .3649e-06 


0.960729 


3.1927 


4.112377 


1.692726 


CPU time(h) 


524 




54.9 


49.2 


53.8 


53.3 


Results for the Experiments with Noisy Data 


Algorithms 


RDPSO 


SS method 


(f/, >l)-ES 


PSO 


DE 


{(J + A)-ES 


Best Value of J 


0.2313 


0.2337 


2.5957 


7.7433 


1 1 .7900 


5.1490 


Mean Value of J 


0.3459 


0.3106 


3.6029 


11.2353 


18.5928 


10.8691 


Standard Deviation of J 


0.1268 


0.1325 


0.1730 


3.2921 


3.3616 


3.7065 


CPU time(h) 


524 




54.9 


49.2 


53.8 


53.3 
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Table 5 Decision vector for the solution found by RDPSO 
in the experiment with noise-free data for Case Study 2. 



Elements of best vector 




0.890001 


0.996928 


1 .990488 


1 .000000 




1 .998655 


0.885005 


1 .000085 


1.000213 




1 .898600 


0.999900 


2.002120 


1.010091 


S-i 3-f 1 6 


0.998763 


1.005851 


2.000021 


0.995512 


S-i 7-^20 


2.000150 


0.996318 


0.100025 


1.000013 




0.100211 


0.099875 


1.000361 


0.100021 


62^-828 


0.100004 


1 .000300 


0.100008 


1.000750 


829-832 


1.000321 


0.987620 


1.000041 


1.000035 


833-836 


0.997855 


0.998856 


1 .000000 


1.000001 


Table 6 Decision vector for the solution found by RDPSO 
in the experiment with noisy data for Case Study 2. 


Elements of best vector 


8,-84 


1.0128 


0.9962 


1 .9923 


1 .0220 


85-83 


1 .9698 


1 .0057 


1 4290 


0.7919 


8g-8r2 


1 .8388 


1.6173 


1 .4905 


0.9384 


8, 3-8] 6 


1.2683 


1.0180 


1 .3832 


0.9271 


8-\ 7-820 


2.0248 


1.3612 


0.1242 


1 4658 


82^-824 


0.0956 


0.1101 


0.5970 


0.1482 


825-82S 


0.0979 


1.0012 


0.0977 


0.9916 


829-832 


1.3351 


1 .9900 


1.3957 


1 .9209 


833-836 


1 .5504 


1.3801 


1 .6637 


1.1668 



other algorithms. The best solution vector corresponding 
to the best value of / (/ = 1.3740e-014) obtained by the 
RDPSO in the numerical experiment with noise-free 
data was = 5.930000e-005, p2 = 2.960000e-005, 
P3 = 2.750002e-004, p^ = 2.750002e-004, ps = 4.000561e- 
005, extremely close to the real values of the parameters, 
and the best solution vector obtained by the RDPSO for 
the numerical experiment with noisy data was (when 
/ = 0.2023) pi = 5.928818e-005, p2 = 2.959918e-005, 
P3 = 1.712580e-005, = 3.147252e-004, /jg = 1.537512e- 
003. Figure 4 and Figure 5 show good fits between the 
experimental data (simulation data) and the predicted 
data, with the best obtained parameters in the experiments 
with both noise-free and noisy data. 

For Case Study 2, the obtained values of / resulted from 
20 runs with 22500 iterations by each algorithm are listed 
in Table 4. For the numerical experiments with noise-free 
data, the best result oij (J = 0.7358e-06) was obtained by 
using the SS method, which also had the better average 
performance than any other compared algorithm, as 
shown by the mean value of / over the 20 runs of the algo- 
rithm. The second best method was the RDPSO algorithm, 
which could converge to a value of / = 0.009124 and had a 
mean value of / = 0.178881 over 20 runs. Table 5 lists the 
estimated values of the model parameters corresponding 
to the best /value (7=0.009124) found by the RDPSO 
algorithm. The results also show that (^, A)-ES is the win- 
ner in this inverse problem compared to {fi + A)-ES, whose 



10-^ 



10' 



10' 



10" 



10 



- RDPSO 

(^,^>ES 

- (n+>.)-ES 

DE 

PSO 

SS 



0.5 



1.5 



X 10 



Figure 6 The figure visualizes the convergence curves of objective function values of all the algorithms averaged over 20 runs of the 
numerical experiments with noisy data in Case Study 2. It is sliown that tlie SS method had the fastest convergence speed and the RDPSO 
had the second fastest one. 
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Figure 7 The figure shows the predicted concentration provided by the RDPSO (continuous line) and noise-free experimental (marker) 
data in Case Study 2 when P = 1.0, S = 2.1544. It is shown that the predicted concentration had good correlation with the experimental data. 



best and mean results of /were 0.123209 and 2.141820, 
respectively. The PSO and DE are two well-known efficient 
population-based optimization methods, which, however, 
could not arrive at the vicinity of the aforementioned solu- 
tions. When the experimental data (simulation data) was 



noisy, the SS method and RPSO obtained similar results for 
the best / value over 20 runs, but the former had a better 
average algorithmic performance. Table 6 shows the identi- 
fied model parameters corresponding to the best / value 
(/ = 0.2313) obtained by the RDPSO for the noisy data. 
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Figure 8 The figure shows the predicted concentration provided by the RDPSO (continuous line) and noisy experimental (marker) 
data in Case Study 2 when P = 1.0, S = 2.1544. It is shown that the predicted concentration had good correlation with the experimental data. 



We plotted in Figure 6 the convergence curve of each 
method averaged over 20 runs. It is shown that the SS 
method had a remarkably better convergence rate than 
others, probably due to its local solver that can enhance 
the local search ability of the algorithm significantly. 
Figures 7a and Figure 8 show comparisons between the 



predicted data and the experimental (simulation) data 
for the decision vectors found by the RDPSO in both 
groups of numerical experiments (with the noise-free 
and noisy data, respectively). It can be observed that 
there is good correlation between the experimental and 
predicted data. 
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Conclusions 

In this paper, a variant of RDPSO algorithm was proposed 
and showed to be able to successfully solve two inverse 
problems associated with the thermal isomerization of 
a-pinene and a three-step pathway, respectively. The results 
indicate that the proposed RDPSO algorithm outperformed 
its PSO predecessors and some other competitors in the 
first problem, and also had the second best algorithmic per- 
formance among all the compared algorithms. 

Like other stochastic optimization methods, a possible 
drawback of the RDPSO method is the computational 
effort required. This is mainly because most of the com- 
putational time was spent on solving the forward pro- 
blem. One measure that can be taken is to incorporate 
the local search technique into the algorithm in order to 
accelerate its convergence speed. Another is to develop 
a parallelized RDPSO implementation to solve inverse 
problems on computer clusters to reduce the computa- 
tional cost to a reasonable level. Our future tasks will 
focus on these two ways of improving the algorithmic 
effectiveness of the RDPSO algorithm. 

Availability and requirements 

In the additional file 1 the source codes of five of the tested 
algorithms on the two benchmark systems are provided. It 
includes two file folds, one for benchmark system 1 and the 
other one for benchmark system 2. All the algorithms are 
programmed with C-i-i- in Microsoft Visual C-I-+ 6.0. 

In additional file 2 the data files for the five algorithms 
used in the two case studies are provided. For each case, 
the data corresponding to the best results generated by 
20 runs of each algorithm are provided in a .txt file. 

Additional material 



Additional file 1: Source. CodeThis fi c includes the source code of all 

tested algorithms on the two benchmark problems, programmed in C++ 
on Microsoft Visual VC++ 6.0. All the source codes are compressed into a 
single .rar file. 

Additional file 2: Data. This file includes the data of the best solution 
out of 20 runs of each tested algorithm. 
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